November 15, 2018

What the hell is Github?

And git

Version control system for data

Logs commits of file changes retrievable at any time

Applications

Version control

Collaborate

Storage for every possible file type, e.g. Supp Material

Dynamic loading of stored links and programs

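require(RCurl) # load RCurl for getURL()
# the URL must point to the raw text of an R script
script <- getURL("https://raw.githubusercontent.com/darwinanddavis")
eval(parse(text = script)) # run the downloaded script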

Fork and clone a plethora of public data, code, material

But why?

Reproducible

Unlimited

Transparent

Shareable

Best practice for git prep

Avoid spaces and CamelCase

  • e.g. 'my data.csv', 'My Data.csv' POOR
  • e.g. 'mydata.csv', 'my_data.csv' GOOD
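
A rough way to check names before adding them to a repo is sketched below. The helper name is_git_friendly is made up for this example, and the rule is deliberately crude: it flags any space or any capital letter, so it would also flag names like README.md.

```shell
# Hypothetical helper: succeeds only for names with no spaces and no capitals.
is_git_friendly() {
  case "$1" in
    *" "*|*[[:upper:]]*) return 1 ;;  # contains a space or an uppercase letter
    *) return 0 ;;
  esac
}

# Try it on the example names from the list above.
for name in 'my data.csv' 'My Data.csv' 'mydata.csv' 'my_data.csv'; do
  if is_git_friendly "$name"; then
    echo "GOOD: $name"
  else
    echo "POOR: $name"
  fi
done
```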

Annotation

p <- rep(rnorm(100),20) # this is well annotated code  

Tab is your friend

Useful syntax

cd change working dir. cd .. move one level up

pwd print current working dir

ls list files in working dir

mkdir newfolder make new working dir

touch text.txt create new file

More useful syntax

cp source destination
copy files from source to destination. e.g. cp /Users/mydir/README.txt ~/Documents

cp -R source destination
copy all folders, subfolders, and files from source to destination

mv source destination
move files or folders from source to destination (no need for -R)

cp ~/Desktop/*.rtf ~/Documents
copy multiple files with the * wildcard; here it copies every .rtf file on the Desktop to Documents. The tilde (~) symbol is a shortcut for your Home folder, which contains 'Desktop'.

mv ~/Desktop/MyFile.rtf ~/Desktop/MyFile-old.rtf
cp ~/Desktop/MyFile.rtf ~/Documents/MyFile-old.rtf
rename files with mv, or copy a file under a new name with cp
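
To practise these commands without touching real files, something like the sketch below could be run. It works inside a throwaway folder made with mktemp; the file and folder names are placeholders chosen for the example.

```shell
# Work inside a throwaway folder so nothing in your real Documents is touched.
sandbox=$(mktemp -d)
cd "$sandbox"
pwd                       # print where we are

mkdir newfolder           # make a new folder
touch text.txt            # create an empty file
ls                        # list what we have so far

cp text.txt newfolder/    # copy the file into the folder
mv text.txt text-old.txt  # rename the original
cp -R newfolder backup    # copy the folder and everything in it
ls newfolder backup       # both folders now hold a copy of text.txt
```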

Let's git it

Initialising and using your repo

1. Create a repo

2. Create and stage your files

- add and commit your files

3. Push to a remote github repo

- push your files

1. Create a repo

Open Terminal/cmd

cd ~/Documents/ # change working dir     
ls # list dir contents      

Open Finder/Windows Explorer. Make a new project on your local computer.

# create new project  
cd ~/Documents
# create new file 
touch test.txt  
open test.txt  
# make a new folder  
mkdir newgit  
# navigate to that folder  
cd newgit
ls -a  

1. Create a repo (cont …)

Move the new file into your repo from the command line

# navigate to your new git repo  
pwd  
cd ~/Documents/newgit

# move the new file into the git repo      
mv ~/Documents/test.txt ~/Documents/newgit
ls  

Initialise your new local repo

# init git
git init  

2. Create and stage your files

Stage the files in your folder by adding them to the local git repo

# add the files to the git  
git add . # the '.' adds everything 
git add test.txt # adds individual files  
git status # check what git is doing   

Commit the staged files

# commit the staged files  
git commit -m 'init commit' # -m adds a message  

We've now staged and committed files to a local repo. Version control!

Let's check the changes

git log # recent git activity
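
The whole local workflow can be tried end to end in a scratch folder, roughly as below. This is a sketch: the temp folder stands in for ~/Documents/newgit, and the name and email are placeholders set for this one repo only.

```shell
# Throwaway folder standing in for ~/Documents/newgit (an assumption for this demo).
repo=$(mktemp -d)
cd "$repo"
touch test.txt

git init -q                          # create the local repo
git config user.name "Demo User"     # placeholder identity, this repo only
git config user.email "demo@example.com"

git add test.txt                     # stage the file
git status --short                   # a leading 'A' means staged
git commit -q -m 'init commit'       # record the staged snapshot
git log --oneline                    # one line per commit
```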

3. Push to a remote github repo

Now we push the changes we made from our local repo to our Github cloud.

  1. Create a new Github repo. Name it using best practice, e.g. no spaces
  2. Don't create a README
  • Leave the box 'Initialize this repository with a README' unchecked

3. Push to a remote github repo (cont …)

First, copy the Github repo link you want to push to. Select either https or SSH (requires key access).

3. Push to a remote github repo (cont …)

Then push your committed files from your local repo to the remote repo

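# set the new remote repo
git remote add origin "your github repo"  
# if origin already exists, point it at the new link instead:
# git remote set-url origin "your github repo"
# see what remote repo you have
git remote -v  
# push changes from local repo to remote repo 
git push -u origin master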

That's it!

Your data is now stored and version controlled
in local and remote repos
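
The push step can be rehearsed offline, as sketched below. The assumption here is that a bare repo in a temp folder stands in for the empty Github repo; with a real Github repo, the https or SSH link goes where the local path is.

```shell
# A bare repo in a temp folder stands in for the empty Github repo (assumption).
remote=$(mktemp -d)/remote.git
git init -q --bare "$remote"

# A local repo with one commit, as in the earlier steps.
work=$(mktemp -d)
cd "$work"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"
touch test.txt
git add test.txt
git commit -q -m 'init commit'

git remote add origin "$remote"           # on Github this would be the repo link
git remote -v                             # confirm where origin points
git push -q -u origin master              # send the commits and set the upstream
```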

Troubleshooting for previous steps

fatal: remote origin already exists
The remote origin already exists, so you can't add it again

git remote rm origin # if origin already exists, remove it
git remote add origin "your github repo" # then re-add 
git push origin master # then push again  

! [rejected] master -> master (non-fast-forward)
Someone else has made changes since your latest ones. Git refuses to lose their commits, so it won't push your new changes

git pull origin master # fetches any updates to online repo and merges them    
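
A rejected push of this kind can be reproduced and fixed in a sandbox, roughly as follows. Everything here is an assumption made for the demo: the bare repo stands in for Github, the 'theirs' clone plays the collaborator, and demo_git is a made-up helper that supplies a placeholder identity.

```shell
base=$(mktemp -d)
# hypothetical helper: run git with a placeholder identity so commits work anywhere
demo_git() { git -c user.name="Demo User" -c user.email="demo@example.com" "$@"; }

git init -q --bare "$base/remote.git"        # stand-in for the Github repo

# my repo: first commit, pushed
git init -q "$base/mine"
cd "$base/mine"
git symbolic-ref HEAD refs/heads/master
git remote add origin "$base/remote.git"
echo "v1" > data.txt; git add data.txt; demo_git commit -q -m 'init commit'
git push -q -u origin master

# a collaborator clones, commits, and pushes before I do
git clone -q -b master "$base/remote.git" "$base/theirs"
cd "$base/theirs"
echo "theirs" > other.txt; git add other.txt; demo_git commit -q -m 'collaborator change'
git push -q origin master

# back in my repo: commit, then try to push
cd "$base/mine"
echo "mine" > mine.txt; git add mine.txt; demo_git commit -q -m 'my change'
git push -q origin master || echo "push rejected"

# pull (fetch + merge) the collaborator's commit, then push again
demo_git pull -q --no-rebase --no-edit origin master
git push -q origin master
```

The --no-rebase and --no-edit flags only keep the demo non-interactive: the first asks for a merge rather than a rebase, the second accepts the default merge message so no editor opens.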

fatal: refusing to merge unrelated histories
Your local repo and the Github repo have no commits in common. Usually associated with a README file created on the Github repo

git pull origin master --allow-unrelated-histories # merges the separate Github history 
# (usually just a README.md commit) into your project

If VIM opens to ask for a merge message, type ':wq' (the colon is 'SHIFT + ;'), then press ENTER to save and quit

Cloning an existing repo

Clone a remote repo to your local computer

This creates a git repository on your local machine complete with version control.

Every version of every file for the history of the project is grabbed by default when you run git clone.

git clone "github url" "new repo name (optional)"
# e.g. git clone https://github.com/darwinanddavis/UsefulCode mynewrepo 

Why clone?

You can dump the contents of any public repo, including its complete version history, onto your own computer, then upload it onto the cloud.
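
To see that a clone really carries the full history, a small offline sketch like the one below can help. It assumes a local two-commit repo as a stand-in for a public Github repo; the folder names and identity are placeholders.

```shell
base=$(mktemp -d)

# A small source repo with two commits stands in for a public Github repo (assumption).
git init -q "$base/source"
cd "$base/source"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "first" > notes.txt;   git add notes.txt; git commit -q -m 'first version'
echo "second" >> notes.txt; git add notes.txt; git commit -q -m 'second version'

# Clone it under a new name; a Github URL would go where the local path is.
git clone -q "$base/source" "$base/mynewrepo"
cd "$base/mynewrepo"
git log --oneline     # both commits are present in the clone
```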

The short version

local git (version control on your comp)

git init # initialise your local git  
git add . # adds all files to git. replace '.' with filename for individ files
git commit -m 'redo intro' # '-m' = message 

remote git (version control on your github)

# after the above steps ^ 
# see what remote repo you have. if a github one exists, you can just push
git remote -v  
# set the new remote repo (if necessary)  
git remote add origin "your github repo"  # if the remote 'origin' doesn't exist yet
git remote set-url origin "your github repo"  # if 'origin' already exists
# push changes from local repo to remote repo 
git push -u origin master

If in doubt, ask the internet

Troubleshooting

Staging and pushing files

Re-do a commit (undo the last commit but keep its changes staged)

git reset --soft HEAD~1
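
A quick way to see what this does is to try it in a scratch repo, as in the sketch below. The file name, commit messages, and identity are placeholders for the example.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "intro" > paper.txt;    git add paper.txt; git commit -q -m 'init commit'
echo "methods" >> paper.txt; git add paper.txt; git commit -q -m 'oops, wrong message'

git reset --soft HEAD~1         # drop the last commit, keep its changes staged
git status --short              # paper.txt still shows as staged
git commit -q -m 'add methods'  # commit again with the message you wanted
git log --oneline
```

Only do this to commits you have not pushed yet, otherwise the local and remote histories stop matching.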

Alternative push option

# option 1
git remote set-url origin "link to existing github repo" # talk to github 
git push -u origin master
# option 2
git remote add github "your github repo"  # if a remote named 'github' doesn't exist yet
git push -u github master

Staging and pushing files (cont …)

If this error appears after pushing to your remote repo:
! [rejected] master -> master (fetch first)

git fetch origin master # download the latest commits from the remote repo     
git merge origin/master # merge those commits into your local master    
git push -u origin master # push to remote repo  

# for non-fast-forward error  
git fetch origin master:tmp
git rebase tmp
git push origin HEAD:master
git branch -D tmp
git push -u origin master 

Staging and pushing files (cont …)

For fatal: refusing to merge unrelated histories error

git checkout master
git merge origin/master --allow-unrelated-histories
# or do the fetch and merge in one step with pull  
git pull --allow-unrelated-histories origin master 
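
The error and its fix can be reproduced offline, roughly as below. This is a sketch under assumptions: the bare repo with a README commit stands in for a Github repo created with a README, and demo_git is a made-up helper that supplies a placeholder identity.

```shell
base=$(mktemp -d)
# hypothetical helper: run git with a placeholder identity
demo_git() { git -c user.name="Demo User" -c user.email="demo@example.com" "$@"; }

# "Github" side: a repo that already contains a README commit
git init -q "$base/seed"
cd "$base/seed"
git symbolic-ref HEAD refs/heads/master
echo "# project" > README.md; git add README.md; demo_git commit -q -m 'Initial commit'
git clone -q --bare "$base/seed" "$base/remote.git"

# local side: a separate repo with its own first commit
git init -q "$base/local"
cd "$base/local"
git symbolic-ref HEAD refs/heads/master
echo "data" > test.txt; git add test.txt; demo_git commit -q -m 'init commit'
git remote add origin "$base/remote.git"

# a plain pull stops because the two histories share no commits
demo_git pull -q --no-rebase --no-edit origin master || echo "pull refused"

# allow the two histories to be merged, then push
demo_git pull -q --no-rebase --no-edit --allow-unrelated-histories origin master
git push -q -u origin master
```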

Delete files from a Github repo

# ensure you are in the default branch:
git checkout master
# the rm -r command will recursively remove your folder:
git rm -r folder-name
# commit the change:
git commit -m "Remove duplicated directory"
# push the change to your remote repo
git push origin master
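
The removal itself can be tried safely in a scratch repo, as sketched below. The names folder-name, a.txt, and keep.txt are placeholders; the push step is left out because this demo has no remote.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# one folder to delete and one file to keep
mkdir folder-name
echo "a" > folder-name/a.txt
echo "keep" > keep.txt
git add .
git commit -q -m 'init commit'

git rm -r -q folder-name                        # remove the folder from git and from disk
git commit -q -m 'Remove duplicated directory'
git ls-files                                    # lists what git still tracks
```

The path given to git rm has to match a tracked file or folder, otherwise git stops with a 'pathspec did not match any files' error.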

Accessibility

If git questions your user credentials (e.g. 'Please tell me who you are'), set them:

git config --global user.email "<your email>" 
git config --global user.name "<your github user name>" 
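
To check what git has stored without changing your real settings, the same commands can be run without --global inside a scratch repo, as in this sketch. The email and user name are placeholders.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "demo@example.com"    # placeholder address, this repo only
git config user.name "demo-user"            # placeholder user name, this repo only
git config user.email                       # print the value git will use here
git config --list --local | grep '^user\.'  # show both settings
```

Without --global the values are stored in this repo only; with --global they apply to every repo on your computer.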

When using SSH for your github remote repo, e.g. git@github.com:username/reponame.git

Generating a new SSH key

Accessing your SSH key:
  • In Mac, in Terminal, type

cat ~/.ssh/id_rsa.pub  
  • In Windows, in Git Bash (cmd does not understand 'ls' or '~'), type
ls ~/.ssh/*.pub   

Accessing commits

There's an alternative

1. Go to the repo landing page on your Github

2. Drag and drop the file/s onto the screen

3. Add a commit message, e.g. 'init' 'updated table2', etc

Why am I telling you this?

I'd rather you move towards open-access and reproducible research than be deterred by the git user experience.

Notes for improvement

  • Make glossary of git notes and terms
  • Make Windows specific commands
  • Highlight there are spaces between words in terminal
  • Use Git Bash
  • start with github not local repo